Evolutionary Tree Based on Oligonucleotide Frequencies and Conservative Words in 16S and 18S Ribosomal RNA

Authors

  • Li-Ching Hsieh
  • Chih-Yuan Tseng
  • Liaofu Luo
  • Mingwen Jia
  • Fengmin Ji
  • Hoong-Chien Lee
Abstract

Sequence distances are defined in terms of the differences in the occurrence frequencies of oligonucleotides of length n in the sequences. Such n-distances are used to construct phylogenetic trees from a set of thirty-five 16S or 18S rRNA sequences. The quality of the trees generally improves with increasing n and reaches a plateau at n = 7 or 8. The best n-distance trees are compatible with trees based on sequence alignment, suggesting that highly overrepresented 7-mers and 8-mers are closely related to rRNA evolution. Out of the 4^7 = 16,384 possible 7-mers, 612 are identified as those whose relative frequencies correlate strongly with the 35×35 n-distance matrix. These evolution-related 7-mers are used to identify “conservative words”: oligonucleotides whose frequencies and loci are common to at least 85% of the organisms preselected to represent a domain. The structural meaning of some of these conservative words is discussed.
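The abstract does not give the exact formula for the n-distance, so the following is only a minimal sketch of the general idea. It assumes the distance is the sum of absolute differences between the relative frequencies of overlapping n-mers in two sequences; that choice, and the function names `nmer_frequencies`, `n_distance`, and `distance_matrix`, are illustrative assumptions and not the authors' definition.

```python
from collections import Counter

ALPHABET = set("ACGU")


def nmer_frequencies(seq, n):
    """Relative frequencies of overlapping n-mers in an RNA/DNA sequence.

    T is mapped to U; windows containing any other symbol are skipped.
    """
    seq = seq.upper().replace("T", "U")
    counts = Counter()
    for i in range(len(seq) - n + 1):
        word = seq[i:i + n]
        if set(word) <= ALPHABET:
            counts[word] += 1
    total = sum(counts.values())
    if total == 0:
        return {}
    return {word: c / total for word, c in counts.items()}


def n_distance(seq_a, seq_b, n):
    """ASSUMED distance: sum of absolute frequency differences over all n-mers.

    The paper's actual definition may differ (e.g. in normalisation).
    """
    fa = nmer_frequencies(seq_a, n)
    fb = nmer_frequencies(seq_b, n)
    return sum(abs(fa.get(w, 0.0) - fb.get(w, 0.0)) for w in set(fa) | set(fb))


def distance_matrix(seqs, n):
    """Pairwise n-distance matrix (list of lists) for a list of sequences."""
    k = len(seqs)
    return [[n_distance(seqs[i], seqs[j], n) for j in range(k)] for i in range(k)]
```

For thirty-five sequences, `distance_matrix(seqs, 7)` would give a 35×35 matrix of the kind the abstract refers to, which could then be passed to any standard distance-based tree-building method. There are 4^n possible words of length n, so at n = 7 each sequence is summarised by up to 16,384 frequencies.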


Related resources

Evolutionary Tree Based on Oligonucleotide Frequencies and Conserved Words in 16S and 18S Ribosomal RNA

Sequence distances are defined in terms of the differences in the frequencies of oligonucleotides of length n. Such n-distances are used to construct phylogenetic trees from a set of thirty-five 16S (18S) rRNA sequences. The quality of the trees generally improves with increasing n and reaches a plateau at n=7 or 8. The best n-distance trees are compatible with trees based on sequence alignment, sugg...


Submit to Gene (a draft) Evolutionary Tree Reconstructed Based on Oligonucleotide Frequencies And Conserved Words in 16s Ribosomal RNA

Evolutionary distance is defined by the difference in oligonucleotide (n-base) frequencies between two sequences. A phylogenetic tree is reconstructed using a set of 16S (18S) rRNA sequences and this definition of distance. The quality of the trees generally improves with increasing n and reaches a plateau of best fit at n=7 or 8. Thus, the 7-mer or 8-mer frequencies provide a basis for describing rRNA evolution. The...


A comparative phylogenetic analysis of Theileria spp. by using two "18S ribosomal RNA" and "Theileria annulata merozoite surface antigen" gene sequences

More than 185 species, strains and unclassified Theileria parasites are categorized in the Entrez Taxonomy. The accurate diagnosis and proper identification of the causative agents are important for understanding the epidemiology, prevention and appropriate treatment. This study aims to discuss the importance of two genes of Theileria annulata 18S ribosomal RNA (18S rRNA) and Theileria annulata...


Search for Evolution-Related-Oligonucleotides and Conservative Words in rRNA Sequences

We describe a method for finding ungapped conserved words in rRNA sequences that is effective, utilizes evolutionary information and does not depend on multiple sequence alignment. Evolutionary distance (called n-distance) between a pair of 16S or 18S rRNA sequences is defined in terms of the difference in the two sets of frequencies of occurrence of oligonucleotides n bases long (n-mers) given ...


Phylogenetic analysis of genes coding for 16S rRNA in mammalian ureaplasmas.

Phylogenetic relationships among species of the genus Ureaplasma were elucidated by analyzing 16S rRNA sequence information. The 16S rRNA genes of six strains of the mammalian Ureaplasma species were amplified by PCR and were sequenced directly by a primer walking method. The phylogenetic tree based on the nucleotide sequence of the 16S rRNA genes corresponded to the evolutionary history of the...




Publication date: 2004